/*
The analyses are organized for the specific results discussed
in the section titled: 

How do moderation hypotheses fare?
*/

use "../Metadata/tess_analysisdata.dta", clear

** overall rate of moderation hypotheses
	ta moderation if insample == 1

** comparing success of moderation vs non-moderation hypotheses
	prtest hyp_true if insample == 1, by(moderation)

*** Comparing power for different relative sizes of interaction effect

	/*
	Gelman blog post makes this simple: 
	https://statmodeling.stat.columbia.edu/2018/03/15/need16/

	He shows that the standard error of a main effect is 2*sigma / sqrt(N), 
	while the standard error of an interaction (moderator) effect is 
	4*sigma / sqrt(N), i.e., twice as large.
	*/
	
	*1.  Find the z-score (effect divided by its standard error) that gives 
	*90% power for a two-sided test at alpha = .05:

	display invnorm(.90) + 1.96
	//3.2415516

	*2.  Confirm this works:
	display normal(3.24-1.96)
	//.89972743

	*3.  A moderator effect of the same size as the main effect has twice the 
	*standard error, which halves the z-score:

	display normal((3.24/2)-1.96)
	//.36692826

	*4.  A moderator effect half the size of the main effect has half the 
	*effect and twice the standard error, so the z-score is divided by 4:

	display normal((3.24/4)-1.96)
	//.12507194
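
	* Illustrative sketch (an addition, not one of the reported results): the 
	* same calculation as steps 1-4, using the unrounded z-score from step 1 
	* instead of 3.24. The local macro names z90 and ratio are assumptions 
	* introduced here. Because the interaction standard error is twice the 
	* main-effect standard error, the z-score for a moderator is 
	* z90 * ratio / 2, and matching the main effect's 90% power requires 
	* (2 / ratio)^2 times the sample size.

	local z90 = invnormal(.90) + 1.96
	foreach ratio in 1 .5 {
		display "moderator/main effect ratio `ratio': power = " ///
			normal(`z90' * `ratio' / 2 - 1.96) ///
			", N multiplier for 90% power = " (2 / `ratio')^2
	}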

	
** median effect size (variable d) for main effect vs. moderation hypotheses -- .08 vs .04
	table hyp_type if insample==1, stat(median d)
	

** % of moderation hypotheses with N < 1000 that have significant results 
	* with Bonferroni correction - 4.4%
	ta hyp_true if moderation == 1 & N_person < 1000 & insample == 1	

	
** % of main effect hypotheses with N < 1000 that have significant results
	ta hyp_true if hyp_type==1 & N_person < 1000 & insample == 1	
	
	
** % of moderation hypotheses with N < 1000 that have significant results 
	* without Bonferroni correction - 6.1%
	* temporary indicator: uncorrected p-value (twop) below .05
	tempvar hyp_true_nobonf 
	gen `hyp_true_nobonf' = (twop < .05)	
	ta `hyp_true_nobonf' if moderation == 1 & N_person < 1000 & insample == 1	
	
	

** % of moderation hypotheses with N >= 2000 that have significant results 

	* with Bonferroni correction - 20%
	ta hyp_true if moderation == 1 & N_person >= 2000 & insample == 1	


** Larger N studies are not more likely to involve moderator hypotheses	
	pwcorr moderation N_person if insample == 1, sig


** Studies with a lot of moderators

	* # of studies with twelve or more moderator hypotheses - 8
	* total() is the current name of the egen function formerly called sum()
	egen moderator_count = total(moderation), by(proposal_id)
	ta moderator_count if hyp_num == 1
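
	* Illustrative check (an addition): count the studies directly rather than 
	* reading the total off the table. Like the tabulation above, this assumes 
	* hyp_num == 1 marks exactly one row per study.
	count if moderator_count >= 12 & hyp_num == 1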

	** rates of supported hypotheses in studies with 12+ moderation tests

	* without Bonferroni correction - 5%
	* temporary indicator: uncorrected p-value (twop) below .05
	tempvar hyp_true_nobonf 
	gen `hyp_true_nobonf' = (twop < .05)	
	ta `hyp_true_nobonf' if moderator_count >= 12 & moderation == 1 & insample == 1
	
	